** Data reading and variable selection from raw data
** Japan 2000 National Survey on Family and Economic Conditions (NSFEC) 


** 01. Reading data **

cap log close
clear all
set more off
cd /*insert your working directory here*/
use /*read your data here*/ 


** 02. Constructing year and country variables **

ge year=2000
lab var year "survey year"

ge country=392
lab var country "ISO country code"
// Japan: 392 (see "ISO Country Codes.pdf")


** 03. ID variables **

ge pid=PERSONID
lab var pid "person id"


** 04. Basic Demographics (Sex and Age/birth year) **

ge sex=SEX
lab var sex "sex"
lab def sex 1 "male" 2 "female"
lab val sex sex

ge age=AGE
lab var age "age"

ge birthyr=year-age
lab var birthyr "year of birth"


** 05. Siblings **

ge nbro=0 if OLDBRO==1 & YNGBRO==1
replace nbro=1 if OLDBRO==2 & YNGBRO==1
replace nbro=1 if OLDBRO==1 & YNGBRO==2
replace nbro=2 if OLDBRO==2 & YNGBRO==2
replace nbro=2 if OLDBRO==3 | YNGBRO==3
lab var nbro "number of brothers (in category)"
lab def nbro 0 "no brother" 1 "1 brother" 2 "2 and more brothers"
lab val nbro nbro

ge nsis=0 if OLDSIS==1 & YNGSIS==1
replace nsis=1 if OLDSIS==2 & YNGSIS==1
replace nsis=1 if OLDSIS==1 & YNGSIS==2
replace nsis=2 if OLDSIS==2 & YNGSIS==2
replace nsis=2 if OLDSIS==3 | YNGSIS==3
lab var nsis "number of sisters (in category)"
lab def nsis 0 "no sister" 1 "1 sister" 2 "2 and more sisters"
lab val nsis nsis

ge nsibs=0 if nbro==0 & nsis==0
replace nsibs=1 if nbro==1 & nsis==0
replace nsibs=1 if nbro==0 & nsis==1
// one brother plus one sister is two siblings
replace nsibs=2 if nbro==1 & nsis==1
replace nsibs=2 if nbro==2 | nsis==2
lab var nsibs "number of siblings (in category)"
lab def nsibs 0 "no sibling" 1 "1 sibling" 2 "2 and more siblings"
lab val nsibs nsibs

ge birthorder=1 if OLDBRO==1 & OLDSIS==1
replace birthorder=2 if OLDBRO==2 & OLDSIS==1
replace birthorder=2 if OLDBRO==1 & OLDSIS==2
// two or more older siblings: one of each sex, or 2+ of either sex
replace birthorder=3 if (OLDBRO==2 & OLDSIS==2) | OLDBRO==3 | OLDSIS==3
lab var birthorder "birth order"
lab def birthorder 1 "1" 2 "2" 3 "3 and later"
lab val birthorder birthorder
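
// Optional check, a sketch rather than part of the original routine:
// cross-tabulate the derived sibling variables, including missing values,
// to confirm that each brother/sister combination lands in the intended
// category. Assumes the raw items are coded 1 = none, 2 = one,
// 3 = two or more, as the recodes above imply.
tab nbro nsis, m
tab nsibs, m
tab OLDBRO OLDSIS, m
tab birthorder, m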


** 06. Own education **

rename EDUC educ
lab var educ "highest education obtained"


** 07. Parents' education: Father and/or Mother **

// Not available


** 08. Own occupation **

rename Q9B occ
lab var occ "respondent's occupation"


** 09. Parents' occupation **

rename Q15D fawork
lab var fawork "has father experienced job difficulties in the past 10 years"
rename Q15E mowork
lab var mowork "has mother experienced job difficulties in the past 10 years"


** 10. Tabulate the Identified Variables **

log using /*insert your log file path here*/, replace text

** Data reading and variable selection from raw data
** Japan 2000 National Survey on Family and Economic Conditions (NSFEC) 

** Sex **
tab sex

** Age, Birth Year **
sum age birthyr, d

** Siblings **
sum nsibs nbro nsis birthorder, d

** R's Own Education **
tab1 educ 

** Parental Education **
// not available

** R's Own Occupation **
tab1 occ

** Parental Occupation **
tab1 fawork mowork 

log close

** 11. Keep the identified variables only **

keep year country pid sex age birthyr ///
	 nbro nsis nsibs birthorder ///
	 educ occ ///
	 fawork mowork 


** 12. Save the Data File **

saveold /*insert your output file path here*/, replace



** 13. Homogenising education **
** Own Education **
rename educ educ_cat

ge educ_yrs=9 if educ_cat==1
replace educ_yrs=12 if educ_cat==2
replace educ_yrs=14 if educ_cat==3
replace educ_yrs=14 if educ_cat==4
replace educ_yrs=16 if educ_cat==5
replace educ_yrs=4 if educ_cat==6
lab var educ_yrs "respondent highest education in years"

ge educ_ISCED=244 if educ_cat==1
replace educ_ISCED=344 if educ_cat==2
replace educ_ISCED=500 if educ_cat==3
replace educ_ISCED=500 if educ_cat==4
replace educ_ISCED=600 if educ_cat==5
replace educ_ISCED=. if educ_cat==6
lab var educ_ISCED "respondent highest education in ISCED code"
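
// Optional check, a sketch rather than part of the original routine:
// cross-tabulate the source categories against each recode, including
// missing values, to verify the mapping before the file is saved.
tab educ_cat educ_yrs, m
tab educ_cat educ_ISCED, m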


** Parents Education **
//parents education not available


** 14. Homogenising siblings **
// cutoff (top-code) of each sibling count: 2 means "2 and more"
ge nbro_flag=2
lab var nbro_flag "cutoff of number of brothers"
ge nsis_flag=2
lab var nsis_flag "cutoff of number of sisters"
ge nsibs_flag=2
lab var nsibs_flag "cutoff of total number of siblings"


** 15. Tab Education and Sibling Variables **
tab1 sex age birthyr
tab1 educ_cat educ_yrs 
tab1 nbro nsis nsibs nbro_flag nsis_flag nsibs_flag


** 16. Save the Data File **

saveold /*insert your output file path here*/, replace

